Inferring 3D Scene Structure from a Single Still Image
Authors
Abstract
In this project, we revisit the problem of constructing 3D structure from a single still image. We build on the work of Saxena, Sun and Ng [1] by improving the inference techniques used in their algorithm, with the goal of producing 3D models that are both more quantitatively accurate and more visually pleasing. One area for improvement in the existing algorithm is the penalty function used during MAP inference of the plane parameters. When inferring a 3D model from a single still image, the penalty function enforces constraints such as connectedness, co-planarity, and co-linearity. Its properties therefore determine whether the transition between two planes in the 3D model is smooth or sharp. The current penalty function, the L1 norm of the error, prefers neither a smooth nor a sharp transition. As a result, the 3D models often have walls that slope away from the ground rather than standing straight up. Our goal was to find a penalty function that prefers a sharp transition over a smooth one.
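The claim that the L1 norm is indifferent between smooth and sharp transitions can be checked with a small numerical sketch. The example below is illustrative only: it assumes a fixed total error that is either concentrated at one boundary (sharp) or spread evenly over four boundaries (smooth), and it compares the L1 penalty with a squared penalty and with a concave penalty |e|^p, p = 0.5. The concave penalty and all function names are assumptions chosen for illustration; the abstract does not say which penalty the authors adopted.

```python
def total_penalty(errors, penalty):
    """Sum a per-boundary penalty over a list of boundary errors."""
    return sum(penalty(e) for e in errors)

def l1(e):
    # L1 penalty: linear in the error magnitude.
    return abs(e)

def l2(e):
    # Squared penalty: convex, grows faster than linear.
    return e * e

def concave(e, p=0.5):
    # Hypothetical concave penalty |e|^p with 0 < p < 1 (an assumption,
    # not the penalty from the paper).
    return abs(e) ** p

# The same total error (1.0) distributed in two ways.
sharp = [1.0, 0.0, 0.0, 0.0]       # all error at a single boundary
smooth = [0.25, 0.25, 0.25, 0.25]  # error spread evenly

for name, fn in [("L1", l1), ("L2", l2), ("concave", concave)]:
    print(name, total_penalty(sharp, fn), total_penalty(smooth, fn))
```

Under these assumptions, L1 assigns both configurations the same cost, the squared penalty assigns a lower cost to the smooth configuration, and the concave penalty assigns a lower cost to the sharp one. This is one way to see why a penalty that favors sharp transitions has to grow more slowly than linearly in the error.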
Related resources
A Close-Form Iterative Algorithm for Depth Inferring from a Single Image
Inferring depth from a single image is a difficult task in computer vision, which needs to utilize adequate monocular cues contained in the image. Inspired by Saxena et al.'s work, this paper presents a close-form iterative algorithm to process multi-scale image segmentation and depth inferring alternately, which can significantly improve segmentation and depth estimate results. First, an EM-base...
RAPTOR Technical Report
In this technical report we present RAPTOR (Rapid Three-Dimensional Orientation Resolver), which is a novel pipeline for inferring solely from 2D image data the 3D position and orientation (pose) of known classes of rigid objects for which man-made 3D models are available. There are many existing systems and techniques that attempt to infer 3D meshes and scene information from 2D image or video...
Complete 3D Scene Parsing from Single RGBD Image
Inferring the location, shape, and class of each object in a single image is an important task in computer vision. In this paper, we aim to predict the full 3D parse of both visible and occluded portions of the scene from one RGBD image. We parse the scene by modeling objects as detailed CAD models with class labels and layouts as 3D planes. Such an interpretation is useful for visual reasoning...
SeeThrough: Finding Chairs in Heavily Occluded Indoor Scenes
3D geometry mockups of single images of indoor scenes are useful for many applications including interior design, content creation for virtual reality, and image manipulation. Unfortunately, manually modeling a scene from a single image is tedious and requires expert knowledge. We aim to construct scene mockups from single images automatically. However, automatically inferring 3D scenes from 2D...
3D Shape from Anisotropic Diffusion
We cast the problem of inferring the 3D shape of a scene from a collection of defocused images in the framework of anisotropic diffusion. We propose a novel algorithm that can estimate the shape of a scene by inferring the diffusion coefficient of a heat equation. The method is optimal, as we pose it as the minimization of a certain cost functional based on the input images, and fast. Furthermo...